Speech restoration based on deep learning autoencoder with layer-wised pretraining
Authors
Abstract
A neural network can be used to “remember” speech patterns by encoding the statistical regularities of speech in its parameters. Clean speech can then be “recalled” when noisy speech is presented to the network. Adding more hidden layers increases network capacity, but as the number of hidden layers grows (a deep network), the network is easily trapped in a poor local solution when a traditional training strategy is used. As a result, a deep network sometimes performs even worse than a shallow one. In this study, we explore the greedy layer-wise pretraining strategy to train a deep autoencoder (DAE) for speech restoration, and apply the restored speech to noise-robust speech recognition. The DAE is first pretrained layer by layer with a quasi-Newton optimization algorithm, where each layer is treated as a shallow autoencoder and the output of the preceding layer serves as the input to the next layer. The pretrained layers are then stacked and “unrolled” into a DAE, and the pretrained parameters serve as the initial parameters of the DAE for fine-tuning. The trained DAE is used as a filter for speech restoration when noisy speech is given. Noise-robust speech recognition experiments were carried out to examine the performance of the trained deep network. Experimental results show that the DAE trained with the pretraining process significantly improved speech restoration from noisy input.
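The procedure described in the abstract (shallow autoencoders pretrained one at a time with a quasi-Newton optimizer, then stacked, unrolled, and fine-tuned) can be sketched as follows. This is only an illustrative assumption of how such a pipeline might look, not the authors' implementation: the layer sizes, data placeholders, and the use of PyTorch with torch.optim.LBFGS as the quasi-Newton optimizer are all hypothetical choices.

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes: input is a vector of spectral features,
# hidden sizes are illustrative only.
layer_sizes = [257, 512, 256, 128]

def pretrain_layer(in_dim, out_dim, data, epochs=10):
    """Train one shallow autoencoder (encoder + decoder) on `data` using
    L-BFGS, a quasi-Newton method, and return its layers plus the hidden
    activations that feed the next layer."""
    enc = nn.Linear(in_dim, out_dim)
    dec = nn.Linear(out_dim, in_dim)
    model = nn.Sequential(enc, nn.Sigmoid(), dec)
    opt = torch.optim.LBFGS(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        def closure():
            opt.zero_grad()
            loss = loss_fn(model(data), data)
            loss.backward()
            return loss
        opt.step(closure)
    with torch.no_grad():
        hidden = torch.sigmoid(enc(data))
    return enc, dec, hidden

# Greedy layer-wise pretraining: each layer is a shallow autoencoder whose
# hidden output serves as the input to the next layer.
x = torch.randn(1000, layer_sizes[0])    # placeholder for clean-speech features
encoders, decoders, h = [], [], x
for in_dim, out_dim in zip(layer_sizes[:-1], layer_sizes[1:]):
    enc, dec, h = pretrain_layer(in_dim, out_dim, h)
    encoders.append(enc)
    decoders.append(dec)

# Stack and "unroll" into a deep autoencoder: encoder layers followed by the
# decoder layers in reverse order; the pretrained weights are the initial
# parameters for fine-tuning.
layers = []
for enc in encoders:
    layers += [enc, nn.Sigmoid()]
for dec in reversed(decoders):
    layers += [dec, nn.Sigmoid()]
dae = nn.Sequential(*layers[:-1])        # linear output for reconstruction

# Fine-tune the whole network end to end on the same reconstruction objective.
opt = torch.optim.LBFGS(dae.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
for _ in range(20):
    def closure():
        opt.zero_grad()
        loss = loss_fn(dae(x), x)
        loss.backward()
        return loss
    opt.step(closure)

# At test time, noisy features would be passed through the trained DAE to
# "recall" an estimate of the clean speech, which is then fed to the recognizer.
```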
Similar resources
Speech enhancement based on deep denoising autoencoder
We have previously applied a deep autoencoder (DAE) for noise reduction and speech enhancement. However, the DAE was trained using only clean speech. In this study, by using noisy-clean training pairs, we further introduce a denoising process in learning the DAE. In training the DAE, we still adopt the greedy layer-wise pretraining plus fine-tuning strategy. In pretraining, each layer is trained as a...
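A minimal sketch of the denoising variant described in this snippet follows; the only change relative to the previous example is that the reconstruction loss is computed between the network's output for a noisy input and the corresponding clean target. The small placeholder network and data below are assumptions so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# Placeholder stand-in for the stacked, pretrained DAE from the earlier sketch.
dae = nn.Sequential(nn.Linear(257, 256), nn.Sigmoid(), nn.Linear(256, 257))

noisy = torch.randn(1000, 257)   # placeholder noisy-speech features
clean = torch.randn(1000, 257)   # time-aligned clean-speech targets

opt = torch.optim.LBFGS(dae.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

def closure():
    opt.zero_grad()
    loss = loss_fn(dae(noisy), clean)   # input: noisy, target: clean
    loss.backward()
    return loss

opt.step(closure)
```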
Multi-pretrained Deep Neural Network
Pretraining is widely used in deep neural networks, and one of the most well-known pretraining models is the Deep Belief Network (DBN). The optimization formulas differ across pretraining models during the pretraining process. In this paper, we pretrained deep neural networks with different pretraining models and thereby investigated the difference between the DBN and the Stacked Denoising Autoencoder...
Deep Bottleneck Classifiers in Supervised Dimension Reduction
Deep autoencoder networks have successfully been applied in unsupervised dimension reduction. The autoencoder has a "bottleneck" middle layer of only a few hidden units, which gives a low dimensional representation for the data when the full network is trained to minimize reconstruction error. We propose using a deep bottlenecked neural network in supervised dimension reduction. Instead of tryi...
Advances in Deep Learning
Deep neural networks have recently become increasingly popular under the name of deep learning due to their success in challenging machine learning tasks. Although the popularity is mainly due to these recent successes, the history of neural networks goes as far back as 1958, when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...
Convergence rates for pretraining and dropout: Guiding learning parameters using network structure
Unsupervised pretraining and dropout have been well studied, especially with respect to regularization and output consistency. However, our understanding of the explicit convergence rates of the parameter estimates, and of their dependence on the learning (like denoising and dropout rate) and structural (like depth and layer lengths) aspects of the network, is less mature. An interesting questio...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2012